1 Introduction



1.1 Background 

For a prime $p$ and an integer $u$ with $\gcd(u,p) = 1$ the Fermat quotient $q_p(u)$
is defined as the unique integer with

$$q_p(u) \equiv \frac{u^{p-1} - 1}{p} \pmod p, \qquad 0 \le q_p(u) \le p - 1,$$

and we also define

$$q_p(kp) = 0, \qquad k \in \mathbb{Z}.$$
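As a quick illustration of the definition, the following Python sketch evaluates $q_p(u)$ by reducing $u^{p-1}$ modulo $p^2$. The function name `fermat_quotient` is our own choice for illustration, and the primality of `p` is assumed rather than checked.

```python
def fermat_quotient(u: int, p: int) -> int:
    """Return q_p(u) for a prime p (primality of p is assumed, not verified)."""
    if u % p == 0:
        return 0  # convention: q_p(kp) = 0
    # u^(p-1) mod p^2 has the form 1 + p*t with 0 <= t < p, and t = q_p(u)
    return (pow(u, p - 1, p * p) - 1) // p


if __name__ == "__main__":
    for u in range(1, 7):
        print(u, fermat_quotient(u, 7))
```

The built-in three-argument `pow` performs the repeated squaring, so a single evaluation costs $O(\log p)$ modular multiplications.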

It is well known that the divisibility of Fermat quotients $q_p(a)$ by $p$
has numerous applications, which include Fermat's Last Theorem and
squarefreeness testing, see [131 [13 Ed Elj. In particular, the smallest value
$\ell_p$ of $u > 1$ for which $q_p(u) \not\equiv 0 \pmod p$ plays a prominent role in these
applications; the following estimates are given in [5]:

$$\ell_p \le \begin{cases} (\log p)^{463/252+o(1)} & \text{for all } p,\\ (\log p)^{5/3+o(1)} & \text{for almost all } p, \end{cases}$$

(where almost all $p$ means for all $p$ but a set of relative density zero), which
improve the previous estimates of the form $\ell_p = O((\log p)^2)$ of [T51 [TBI |2"T| 121].
It is widely believed that $\ell_p = 2$ for all primes $p$, except for a very thin set
of so-called Wieferich primes, for which one expects $\ell_p = 3$ (in particular, it is
expected that $\ell_p \le 3$ for all primes). The behaviour (and even the infinitude)
of Wieferich primes is still very poorly understood, although several interesting
results relating Wieferich primes to other number theoretic problems
are known, see [1511251125].

There are also several results about the distribution of Fermat quotients.
For instance, Heath-Brown [20] has proved that the Fermat quotients $q_p(u)$
are asymptotically uniformly distributed (after scaling by $1/p$ and mapping
them into $q_p(u)/p \in [0,1]$) for $u = M+1, \ldots, M+N$ for any integers $M$ and
$N \ge p^{1/2+\varepsilon}$ for some fixed $\varepsilon > 0$ and $p \to \infty$. Note that [20, Theorem 2] gives
this only for $N \ge p^{3/4+\varepsilon}$, but using the full strength of the Burgess bound
one can lower this threshold down to $N \ge p^{1/2+\varepsilon}$; see Lemma 2 below and
also [13, Section 4].

It is also shown in [15, Proposition 2.1] that for any integer $a$ the number
of solutions to the equation $q_p(u) = a$, $0 \le u < p$, satisfies

$$\#\{u \in \{0,\ldots,p-1\} : q_p(u) = a\} \le p^{1/2+o(1)}. \tag{1}$$

Finally, we also recall several results on congruences involving Fermat
quotients, see [3J HJ EI] and references therein.



1.2 Our results 

Here we consider the dynamical system generated by Fermat quotients. That
is, we fix a sufficiently large prime $p$ and, for an initial value $u_0 \in \{0,\ldots,p-1\}$,
we consider the sequence

$$u_n = q_p(u_{n-1}), \qquad n = 1, 2, \ldots. \tag{2}$$

Clearly, there is some $t$ such that $u_t = u_k$ for some $k < t$. Then $u_{n+t} = u_{n+k}$
for any $n \ge 0$. Accordingly, for the smallest value of $t$ with the above
condition, we call $u_0, \ldots, u_{t-1}$ the orbit of the initial value $u_0$.

Here we address various questions concerning the sequences generated
by (2), such as the number of fixed points, the image size and the "typical" orbit
length. In particular, we compare their characteristics with those expected
from random maps, see [14]. All our numerical results support the natural
expectation that the map $u \mapsto q_p(u)$ behaves very similarly to a random map
on the set $\{0,\ldots,p-1\}$.

We also investigate their distribution and other characteristics which are
relevant to their use as pseudorandom number generators. As we have mentioned,
a result of Heath-Brown [20] implies that the fractions $q_p(u)/p$ are
uniformly distributed for $u = M+1, \ldots, M+N$, provided that $N \ge p^{1/2+\varepsilon}$
for some fixed $\varepsilon > 0$. However, the method of [20], based on bounds of
multiplicative character sums, such as the Polya-Vinogradov and Burgess bounds,
see [22, Theorems 12.5 and 12.6], does not seem to apply to studying the
distribution of several consecutive elements (as it is essentially equivalent to
estimating short sums of multiplicative characters modulo $p^2$ with polynomial
arguments). Here we use a different approach to study the distribution
of points

$$\left(\frac{q_p(u)}{p}, \ldots, \frac{q_p(u+s-1)}{p}\right), \qquad u = M+1, \ldots, M+N, \tag{3}$$

in the $s$-dimensional cube, which is nontrivial provided that $N \ge p^{1+\varepsilon}$ for
any fixed real $\varepsilon > 0$ and integer $s \ge 1$.

We also obtain a nontrivial lower bound on the linear complexity of the
sequence $q_p(u)$, which is also a very important characteristic of any sequence
relevant to its applications to both cryptography and Quasi-Monte Carlo
methods, see [23 E2].

Besides theoretic estimates, we also present results of several numerical
tests. Some of these tests are based on a modification of an algorithm
described in [12, 13], which seems to be more computationally efficient. We
also address some other algorithmic aspects of computation with Fermat
quotients. In particular, we give asymptotic estimates of the complexity of
several new algorithms which we design for this purpose.

We note that all heuristic predictions concerning various conjectures about
Fermat quotients (for example, the expected number of Wieferich primes up
to $x$ as $x \to \infty$) are based on the assumption of the pseudorandomness of
the map $u \mapsto q_p(u)$. Our results provide some theoretic and experimental
support for this assumption, which seems never to have been systematically
verified prior to our work.

Finally, motivated by the pseudorandom nature of the map $u \mapsto q_p(u)$,
we also discuss some possibilities of using Fermat quotients for designing
cryptographically useful hash functions.

We remark that Smart and Woodcock [33] have considered iterations of
a related function



in the ring of $p$-adic integers. However, the setting of [33] (where $p$ is fixed,
for example $p = 2$) and our setting, where $p$ is the main growing parameter,
are very different.

2 Preparations



2.1 General Notation 

Throughout the paper, $p$ always denotes a prime number, while $k$, $m$ and $n$
(in both the upper and lower cases) denote positive integers.
For integers $a$, $b$ and $m \ge 1$ with $\gcd(b,m) = 1$, we write

$$c = a/b \ \mathrm{rem}\ m$$

for the unique integer $c$ with $bc \equiv a \pmod m$ and $0 \le c < m$.
We also define

$$e_p(z) = \exp(2\pi i z/p).$$

The implied constants in the symbols '$O$' and '$\ll$' may occasionally depend
on an integer parameter $s$ and are absolute otherwise. We recall that
the notations $U = O(V)$ and $U \ll V$ are both equivalent to the assertion
that the inequality $|U| \le cV$ holds for some constant $c > 0$.

2.2 Discrepancy and linear complexity 

Given a sequence $\Gamma$ of $N$ points

$$\Gamma = \left\{(\gamma_{n,1}, \ldots, \gamma_{n,s})\right\}_{n=0}^{N-1} \tag{5}$$

in the $s$-dimensional unit cube $[0,1)^s$ it is natural to measure the level of its
statistical uniformity in terms of the discrepancy $\Delta(\Gamma)$. More precisely,

$$\Delta(\Gamma) = \sup_{B \subseteq [0,1)^s} \left| \frac{T_\Gamma(B)}{N} - |B| \right|,$$

where $T_\Gamma(B)$ is the number of points of $\Gamma$ inside the box

$$B = [\alpha_1, \beta_1) \times \ldots \times [\alpha_s, \beta_s) \subseteq [0,1)^s$$

of volume $|B|$ and the supremum is taken over all such boxes, see [HI 123].

Typically the bounds on the discrepancy of a sequence are derived from
bounds of exponential sums with elements of this sequence. The relation
is made explicit in the celebrated Erdős-Turán-Koksma inequality, see [11,
Theorem 1.21], which we present in the following form.

Lemma 1. For any integer $H > 1$ and any sequence $\Gamma$ of $N$ points (5) the
discrepancy $\Delta(\Gamma)$ satisfies the following bound:

$$\Delta(\Gamma) = O\left( \frac{1}{H} + \frac{1}{N} \sum_{0 < |\mathbf{h}| \le H} \prod_{j=1}^{s} \frac{1}{|h_j| + 1} \left| \sum_{n=0}^{N-1} \exp\left( 2\pi i \sum_{j=1}^{s} h_j \gamma_{n,j} \right) \right| \right),$$

where the sum is taken over all integer vectors $\mathbf{h} = (h_1, \ldots, h_s) \in \mathbb{Z}^s$ with
$|\mathbf{h}| = \max_{j=1,\ldots,s} |h_j| \le H$.

Finally, we recall that the linear complexity $L$ of an $N$-element sequence
$s_0, \ldots, s_{N-1}$ in a ring $\mathcal{R}$ is defined as the smallest $L$ such that

$$s_{u+L} = c_{L-1} s_{u+L-1} + \ldots + c_0 s_u, \qquad 0 \le u \le N-L-1,$$

for some $c_0, \ldots, c_{L-1} \in \mathcal{R}$, see [H [251 132].



2.3 Exponential sums 

First, we recall the bound of Heath-Brown [20] on exponential sums with
$q_p(u)$. Although here we use it only with $\nu = 2$ (exactly as it is given in [20])
we formulate it in full generality.

As we have mentioned, the method of Heath-Brown [20] combined with
the Polya-Vinogradov bound (when $\nu = 1$) and the Burgess bound (when
$\nu \ge 2$), see [22, Theorems 12.5 and 12.6], implies the following generalisation
of [20, Theorem 2]:

Lemma 2. For any fixed integer $\nu \ge 1$, we have

$$\max_{\gcd(a,p)=1} \left| \sum_{u=M+1}^{M+N} e_p(a q_p(u)) \right| \le N^{1-1/\nu} p^{(\nu+1)/2\nu^2+o(1)}$$

as $p \to \infty$, uniformly over $M$ and $N \ge 1$.

We now recall the following well-known bound, see [22, Bound (8.6)].

Lemma 3. For any integers $K$ and $r$, we have

$$\left| \sum_{k=0}^{K-1} e_p(kr) \right| \ll \min\left\{ K, \frac{p}{\|r\|} \right\},$$

where

$$\|r\| = \min_{s \in \mathbb{Z}} |r - sp|$$

is the distance between $r$ and the closest multiple of $p$.

2.4 Basic properties of Fermat quotients

Most of our results are based on the following two well-known properties of
Fermat quotients.

For any integers $k$, $u$ and $v$ with $\gcd(uv,p) = 1$ we have

$$q_p(uv) \equiv q_p(u) + q_p(v) \pmod p \tag{6}$$

and

$$q_p(u + kp) \equiv q_p(u) - ku^{-1} \pmod p, \tag{7}$$

see, for example, p21 Equations (2) and (3)].
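The two congruences (6) and (7) are easy to confirm numerically for small primes. The sketch below is only a sanity check under our own naming (`check_properties` and its parameter `k_range` are hypothetical helpers, not part of the original text); it assumes Python 3.8 or later for the modular inverse `pow(u, -1, p)`.

```python
def fermat_quotient(u: int, p: int) -> int:
    """q_p(u) for a prime p, with q_p(kp) = 0."""
    if u % p == 0:
        return 0
    return (pow(u, p - 1, p * p) - 1) // p


def check_properties(p: int, k_range: int = 3) -> bool:
    """Verify (6) for all 1 <= u, v < p and (7) for |k| <= k_range."""
    q = [fermat_quotient(u, p) for u in range(p)]
    for u in range(1, p):
        u_inv = pow(u, -1, p)  # u^{-1} modulo p
        for v in range(1, p):
            # (6): q_p(uv) = q_p(u) + q_p(v) (mod p)
            if fermat_quotient(u * v, p) != (q[u] + q[v]) % p:
                return False
        for k in range(-k_range, k_range + 1):
            # (7): q_p(u + kp) = q_p(u) - k u^{-1} (mod p)
            if fermat_quotient(u + k * p, p) != (q[u] - k * u_inv) % p:
                return False
    return True
```

For example, `check_properties(101)` runs through all pairs $1 \le u, v \le 100$ and a few shifts by multiples of $p$.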

3 Dynamical Properties 
3.1 Computation of $q_p(u)$

As we have mentioned, computing each individual value of $q_p(u)$ can be
done in $O(\log p)$ arithmetic operations on $O(\log p)$-bit integers via repeated
squaring computation of $u^{p-1}$ modulo $p^2$; we refer to [16] for a background on
modular arithmetic and complexity of various algorithms. In particular, one
can easily reformulate our complexity estimates in terms of bit operations.

Thus computing all values of $q_p(u)$, $0 \le u \le p-1$, requires $O(p\log p)$
arithmetic operations on $O(\log p)$-bit integers. Such computation is necessary, for
example, to find all fixed points of the map $u \mapsto q_p(u)$ or for finding the
image size.

Here we show that there is a slightly more efficient algorithm which is
based on (6) and (7).

We assume that we are given a primitive root $g$ modulo $p$. This can be
done at the pre-computation stage and we keep it outside of the algorithm
(in any case, it can be found in $p^{1/4+o(1)}$ arithmetic operations on $O(\log p)$-bit
integers, see [27], which is lower than the cost of the remaining parts of the algorithm).

Algorithm 4 (Generating $q_p(u)$, $0 \le u \le p-1$).

Input: A prime $p$ and a primitive root $g$ modulo $p$ with $1 < g < p$.
Output: A permuted sequence of the values $q_p(u)$, $0 \le u \le p-1$.

1. Set $q_p(0) = 0$ and $q_p(1) = 0$.

2. Compute $q_p(g)$ using the repeated squaring modulo $p^2$.

3. Set $b_1 = g$ and $c_1 = g^{-1}$ rem $p$.

4. For $i = 2, \ldots, p-2$ compute

(a) $b_i = g b_{i-1}$ rem $p$ and $c_i = c_{i-1} g^{-1}$ rem $p$;

(b) $k_i = (g b_{i-1} - b_i)/p$;

(c) $q_p(b_i) = q_p(g) + q_p(b_{i-1}) + k_i c_i$ rem $p$.

Theorem 5. Algorithm 4 computes every value $q_p(u)$, $0 \le u \le p-1$, in
$O(p)$ arithmetic operations on $O(\log p)$-bit integers.

Proof. The complexity estimate is immediate. The correctness of the
algorithm follows from the congruences

$$q_p(b_i) = q_p(g b_{i-1} - k_i p) \equiv q_p(g b_{i-1}) + k_i (g b_{i-1})^{-1} \equiv q_p(g) + q_p(b_{i-1}) + k_i c_i \pmod p,$$

which in turn follow from (6) and (7). □
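The steps of Algorithm 4 can be sketched in Python as follows. This is a minimal illustration rather than an optimised implementation: the function name `all_fermat_quotients` is our own, the primitive root `g` is assumed to be supplied by the caller (it is not verified), and the result is returned as a list indexed by $u$ instead of a permuted sequence.

```python
def all_fermat_quotients(p: int, g: int) -> list:
    """Return [q_p(0), ..., q_p(p-1)], given a primitive root g modulo p."""
    q = [0] * p                               # step 1: q_p(0) = q_p(1) = 0
    qg = (pow(g, p - 1, p * p) - 1) // p      # step 2: q_p(g) by repeated squaring
    g_inv = pow(g, -1, p)
    b, c = g, g_inv                           # step 3: b_1 = g, c_1 = g^{-1} rem p
    q[b] = qg
    for _ in range(2, p - 1):                 # step 4: i = 2, ..., p - 2
        gb = g * b
        b_next = gb % p                       # (a) b_i = g * b_{i-1} rem p
        c = (c * g_inv) % p                   #     c_i = b_i^{-1} rem p
        k = (gb - b_next) // p                # (b) k_i
        q[b_next] = (qg + q[b] + k * c) % p   # (c) recurrence from (6) and (7)
        b = b_next
    return q
```

Each iteration uses a constant number of multiplications and reductions, which matches the $O(p)$ operation count of Theorem 5; only the single value $q_p(g)$ needs a modular exponentiation.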

Note that the algorithm of [12, 13] is very similar, except that it uses
$g = 2$ instead of a primitive root. This makes each step faster, but if 2 is not
a primitive root modulo $p$ it requires going through all cosets of the
group generated by 2 modulo $p$ and thus requires more "administration" of
data and also more memory.

Unfortunately Algorithm 4 does not help to compute $q_p(u)$ for a given
value of $u$ unless all values $q_p(v)$, $0 \le v \le p-1$, are precomputed and stored
in a table, after which $q_p(u)$ can simply be read from there. We now describe a
trade-off algorithm which requires less memory but makes the computation of $q_p(u)$
more expensive than a simple table look-up. It depends on a parameter
$z \ge 2$, which can be adjusted to particular algorithmic needs.

For a real $V < p$ we use $\mathcal{Q}_p(V)$ to denote the table of the values of $q_p(v)$
with $v \in [0, V]$. We see from Theorem 5 that $\mathcal{Q}_p(V)$ can be computed in
$O(\min\{p, V\log p\})$ arithmetic operations on $O(\log p)$-bit integers.

Furthermore, for an integer $m$, we use $\mathcal{T}_m(V)$ to denote the table of
the values $v^{-1}$ rem $m$ with $v \in [1, V]$ and $\gcd(v,m) = 1$. Since by the
Euler theorem $v^{-1} \equiv v^{\varphi(m)-1} \pmod m$, where $\varphi(m)$ is the Euler function,
we see that $\mathcal{T}_m(V)$ can be computed in $O(V\log m)$ arithmetic operations
on $O(\log m)$-bit integers (there are even more efficient modular inversion
algorithms with a better bound on the number of bit operations, see [16];
however using them does not change the overall complexity of our algorithm).

Algorithm 6 (Computing $q_p(u)$ for a given $u \in [0, p-1]$).

Input: A prime $p$, a real $z \ge 2$, the tables $\mathcal{Q}_p(p/z)$, $\mathcal{T}_p(p/z)$, $\mathcal{T}_{p^2}(z)$ and an
integer $u \in \{0, \ldots, p-1\}$.

Output: The value of $q_p(u)$.

1. If $u = 0$ set $q_p(u) = 0$.

2. Find integers $v$ and $w$ with $u \equiv v/w \pmod p$ and such that $1 \le v \le p/z$
and $|w| \le z$.

3. Recall $r = w^{-1}$ rem $p^2$ if $w > 0$ or $r = -((-w)^{-1}$ rem $p^2)$ if $w < 0$
from the table $\mathcal{T}_{p^2}(z)$.

4. Compute $s$ with $s \equiv v/w \pmod{p^2}$ and such that $0 \le s < p^2$.

5. Compute $k = (s - u)/p$.

6. Recall $r = v^{-1}$ rem $p$ from the table $\mathcal{T}_p(p/z)$.

7. Recall $q_p(v)$ and $q_p(w)$ from the table $\mathcal{Q}_p(p/z)$.

8. Compute $q_p(u) = (q_p(v) - q_p(w) + krw)$ rem $p$.

Theorem 7. For any integer $u$ with $0 \le u \le p-1$, Algorithm 6 computes
$q_p(u)$ in $O(\log z)$ arithmetic operations on $O(\log p)$-bit integers.

Proof. The correctness of the algorithm follows from the congruences

$$q_p(u) = q_p(s - kp) \equiv q_p(s) + ks^{-1} \equiv q_p(v) - q_p(w) + kv^{-1}w \equiv q_p(v) - q_p(w) + krw \pmod p,$$

which in turn follow from (6) and (7).

It remains to estimate the complexity of finding $v$ and $w$ with $u \equiv v/w \pmod p$.
We can also assume that $z < p$ since otherwise the result is trivial.
We start computing the continued fraction convergents $a_i/b_i$, $\gcd(a_i, b_i) = 1$,
$i = 1, 2, \ldots$, to $u/p$, see, for example, [30] for basic properties of continued
fractions. We define $j$ by the condition

$$b_j \le z < b_{j+1}.$$

By the well-known property of continued fractions, we have

$$\left| \frac{a_j}{b_j} - \frac{u}{p} \right| \le \frac{1}{b_j b_{j+1}} < \frac{1}{b_j z}.$$

We now define

$$v = |a_j p - b_j u|$$

and note that (since $z < p$)

$$0 < v = b_j p \left| \frac{a_j}{b_j} - \frac{u}{p} \right| < \frac{p}{z}.$$

Furthermore $uw \equiv v \pmod p$ for either $w = b_j$ or $w = -b_j$. Finally, since
the denominators of the convergents grow at least exponentially, we see that
$j = O(\log b_j) = O(\log z)$ and thus find $a_j$ and $b_j$ in $O(\log z)$ steps, each of
which requires computing with $O(\log p)$-bit integers. □
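The reduction behind Algorithm 6 can be sketched as follows. Two simplifications are assumptions of this sketch and not part of the algorithm as stated: the small fraction $v/w$ is found with the extended Euclidean algorithm on $(p, u)$ (which produces the same kind of pairs as the continued fraction convergents), and the table look-ups are replaced by direct calls, so the code demonstrates correctness of the identity rather than the running time. The names `small_fraction` and `fermat_quotient_via_fraction` are hypothetical, and an integer $2 \le z < p$ with $p$ odd is assumed.

```python
def fermat_quotient(u: int, p: int) -> int:
    """Direct evaluation; stands in for the look-up tables of Algorithm 6."""
    if u % p == 0:
        return 0
    return (pow(u, p - 1, p * p) - 1) // p


def small_fraction(u: int, p: int, z: int):
    """Return (v, w) with w*u = v (mod p), 1 <= v < p/z and 1 <= |w| <= z."""
    r0, r1 = p, u      # remainders of the Euclidean algorithm
    t0, t1 = 0, 1      # cofactors: r_i = t_i * u (mod p)
    while r1 * z >= p:                 # stop at the first remainder below p/z
        quo = r0 // r1
        r0, r1 = r1, r0 - quo * r1
        t0, t1 = t1, t0 - quo * t1
    return r1, t1


def fermat_quotient_via_fraction(u: int, p: int, z: int) -> int:
    if u % p == 0:
        return 0                                   # step 1
    v, w = small_fraction(u, p, z)                 # step 2
    p2 = p * p
    s = v * pow(w % p2, -1, p2) % p2               # steps 3-4: s = v/w (mod p^2)
    k = (s - u) // p                               # step 5 (s = u mod p)
    r = pow(v, -1, p)                              # step 6
    # steps 7-8; q_p(-w) = q_p(w) for odd p, so abs(w) is looked up
    return (fermat_quotient(v, p) - fermat_quotient(abs(w), p) + k * r * w) % p
```

For instance, with $p = 7$, $u = 5$ and $z = 2$ the Euclidean step gives $v = 2$, $w = -1$, and the final formula returns $q_7(5) = 6$.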

We see from Theorem 7 taken with $z = \exp(\sqrt{\log p})$ that, evaluating (in
time $p\exp(-(1+o(1))\sqrt{\log p})$) and storing $p\exp(-(1+o(1))\sqrt{\log p})$ values
of Fermat quotients, we can compute any other value in time $(\log p)^{1/2+o(1)}$.



3.2 Fixed Points 

Let $F(p)$ denote the number of fixed points of the map $u \mapsto q_p(u)$, that is,

$$F(p) = \#\{u \in \{0, \ldots, p-1\} : q_p(u) = u\}.$$

We derive a nontrivial estimate on $F(p)$ from Lemmas 1 and 2.

Theorem 8. We have

$$F(p) \le p^{11/12+o(1)}$$

as $p \to \infty$.

Proof. Let us choose some positive integer parameter $N \in [1, p-1]$ and for an
integer $M$ we denote by $T(p; M, N)$ the number of integers $u \in [M+1, M+N]$
with $q_p(u) \in [M+1, M+N]$. Considering the discrepancy of the fractions
$q_p(u)/p$, $u = M+1, \ldots, M+N$, and combining Lemma 1 (taken with $s = 1$)
with Lemma 2 (taken with $\nu = 2$), we immediately conclude

$$T(p; M, N) = \frac{N^2}{p} + O\left(N^{1/2} p^{3/8+o(1)}\right).$$

Clearly every $u = M+1, \ldots, M+N$ which is a fixed point contributes to
$T(p; M, N)$. Covering the interval $[0, p-1]$ with at most $(p/N + 1)$ intervals
of length $N$ we obtain

$$F(p) \le \left(\frac{p}{N} + 1\right) \left(\frac{N^2}{p} + O\left(N^{1/2} p^{3/8+o(1)}\right)\right).$$

Choosing $N = \lceil p^{11/12} \rceil$, we conclude the proof. □

There is little doubt that the bound of Theorem 8 is very imprecise. It
is easy to see that in the full range $0 \le u \le p^2 - 1$ the relation (7) implies

$$\#\{u \in \{0, \ldots, p^2-1\} : q_p(u) \equiv u \pmod p\} = 2p - 1.$$

Indeed, it is enough to write $u = v + kp$ with $v, k \in \{0, \ldots, p-1\}$ and notice
that

• either $v = 0$ and then $k$ can take any value,

• or $v > 0$ and then the relation (7) identifies $k$ uniquely.

Thus one can expect that $F(p) = O(1)$.

In fact it seems reasonable to expect that the map $u \mapsto q_p(u)$ behaves
similarly to a random map. We recall that for a random map on $m$ elements,
the probability of having $k$ fixed points is

$$\binom{m}{k} \frac{1}{m^k} \left(1 - \frac{1}{m}\right)^{m-k} \to \frac{1}{e\,k!}$$

as $m \to \infty$.

Below we present numerical results giving the numbers $N(k)$ of primes
$p \in [50000, 200000]$ for which the map $u \mapsto q_p(u)$ has exactly $F(p) = k$
fixed points (note that we discard the "artificial" fixed point $u = 0$). We
also give the proportions of such primes $\rho(k) = N(k)/N$, where $N = 12851$
is the total number of primes $p \in [50000, 200000]$, and compare them with
$\rho_0(k) = (e\,k!)^{-1}$ for $k = 0, \ldots, 6$. We note that in the above range $N(k) = 0$
for $k \ge 7$.

k            0       1       2       3        4        5         6
$\rho_0(k)$  0.368   0.368   0.184   0.0613   0.0153   0.00306   0.000511
$N(k)$       4770    4697    2327    844      174      36        3
$\rho(k)$    0.371   0.365   0.181   0.0656   0.0135   0.00280   0.000233

Statistics of fixed points

These numerical results appear to indicate a reasonable agreement
between the prediction and actual results.
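An experiment of this kind can be reproduced on a smaller scale with the sketch below. It is not the code used for the table above: the helper names (`nonzero_fixed_points`, `fixed_point_histogram`, `poisson_prediction`) are our own, primes are found by naive trial division, and every $q_p(u)$ is computed directly, so it is only practical for modest ranges.

```python
from math import e, factorial


def fermat_quotient(u: int, p: int) -> int:
    if u % p == 0:
        return 0
    return (pow(u, p - 1, p * p) - 1) // p


def is_prime(n: int) -> bool:
    """Naive trial division, adequate for small illustrative ranges."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def nonzero_fixed_points(p: int) -> int:
    """Number of u in {1, ..., p-1} with q_p(u) = u (u = 0 is discarded)."""
    return sum(1 for u in range(1, p) if fermat_quotient(u, p) == u)


def fixed_point_histogram(lo: int, hi: int) -> dict:
    """Map k -> number of primes p in [lo, hi] with exactly k fixed points."""
    hist = {}
    for p in range(lo, hi + 1):
        if is_prime(p):
            k = nonzero_fixed_points(p)
            hist[k] = hist.get(k, 0) + 1
    return hist


def poisson_prediction(k: int) -> float:
    """Limiting probability 1/(e k!) of k fixed points for a random map."""
    return 1.0 / (e * factorial(k))
```

Dividing the values of `fixed_point_histogram(lo, hi)` by the number of primes in the range gives empirical proportions that can be set against `poisson_prediction(k)`.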

3.3 Concentration of values 

For integers $k$ and $h \ge 1$ we denote by $U(p; k, h)$ the number of $u \in \{0, \ldots, p-1\}$
for which $q_p(u) \equiv z \pmod p$ for some $z \in [k+1, k+h]$.

As in the proof of Theorem 8, a combination of Lemma 2 (which we take
with $N = p$ and $\nu = 2$) with Lemma 1 gives the following asymptotic formula

$$U(p; k, h) = h + O\left(p^{7/8+o(1)}\right) \tag{8}$$

as $p \to \infty$. On the other hand, using (1), we trivially obtain

$$U(p; k, h) \le h p^{1/2+o(1)},$$

which improves (8) for $h \le p^{3/8}$.

We now obtain a better upper bound, which improves (8) for $h \le p^{3/4}$.

Theorem 9. For any integers $k$ and $h \ge 1$, we have

$$U(p; k, h) \le h^{1/2} p^{1/2+o(1)}$$

as $p \to \infty$.

Proof. Let $\mathcal{U}$ be the set of $u \in \{0, \ldots, p-1\}$ which are counted by $U(p; k, h)$.
Using (6) we see that any $w$ of the form $w = uv$ with $u, v \in \mathcal{U}$ satisfies
$0 \le w \le p^2 - 1$ and

$$q_p(w) \equiv z \pmod p \tag{9}$$

for some $z \in [2k+2, 2k+2h]$. For a fixed integer $z$, there are $O(p)$ values of
$w \in \{0, \ldots, p^2-1\}$ satisfying (9), which follows immediately from (7) (see
also the proof of [15, Proposition 2.1]). So there are at most $O(hp)$ values of
$w$ satisfying (9) with some $z \in [2k+2, 2k+2h]$. Using the classical estimate

$$\tau(w) = w^{o(1)}, \qquad w \to \infty,$$

on the divisor function $\tau(w)$ (see [22, Bound (1.81)] with $k = 2$), we deduce
that each $w = uv$ can be obtained from no more than $p^{o(1)}$ distinct pairs
$(u, v) \in \mathcal{U}^2$. Therefore $(\#\mathcal{U})^2 \le hp^{1+o(1)}$, which concludes the proof. □



3.4 Image size 

Let $M(p)$ be the image size of the map $u \mapsto q_p(u)$ for $0 \le u \le p-1$, that is

$$M(p) = \#\{q_p(u) : 0 \le u \le p-1\}.$$

The bound (1) immediately implies $M(p) \ge p^{1/2+o(1)}$. In fact more precise
bounds

$$\sqrt{p-1} \le M(p) \le p - \sqrt{(p-1)/2}$$

can be obtained from (6) and (7), see [13, Section 3].
We now obtain a stronger lower bound on $M(p)$.

Theorem 10. We have

$$M(p) \ge (1 + o(1)) \frac{p}{(\log p)^2}$$

as $p \to \infty$.

Proof. Let $Q(p, a)$ be the number of primes $\ell \in \{1, \ldots, p-1\}$ with $q_p(\ell) = a$
(note that we have discarded $u = 0$). Clearly

$$\sum_{a=0}^{p-1} Q(p, a) = \pi(p-1) \tag{10}$$

where, as usual, $\pi(x)$ denotes the number of primes $\ell \le x$, and also

$$\sum_{a=0}^{p-1} Q(p, a)^2 = \#\mathcal{R}(p), \tag{11}$$

where

$$\mathcal{R}(p) = \{(\ell, r) : 1 \le \ell, r \le p-1,\ \ell, r \text{ primes},\ q_p(\ell) = q_p(r)\}.$$

We see from (6) that if $(\ell, r) \in \mathcal{R}(p)$ and

$$w \equiv \ell/r \pmod{p^2} \tag{12}$$

then

$$q_p(w) \equiv q_p(\ell) - q_p(r) \equiv 0 \pmod p.$$

Since all $w$ with $q_p(w) \equiv 0 \pmod p$ and $\gcd(w, p) = 1$ have

$$w^{p-1} \equiv 1 \pmod{p^2},$$

they are elements of the group $\mathcal{G}_p$ of the $p$th power residues modulo $p^2$. Thus
we see from (12) that

$$\#\mathcal{R}(p) \le N(p),$$

where $N(p)$ is the number of solutions $(\ell, r, w)$ to

$$w\ell \equiv r \pmod{p^2}, \quad \text{where } \ell, r \le p-1,\ \ell, r \text{ primes},\ w \in \mathcal{G}_p. \tag{13}$$

We note that for $w \equiv 1 \pmod{p^2}$ there are exactly $\pi(p-1)$ pairs $(\ell, r)$
with $\ell = r$ that satisfy (13). For any other $w \in \mathcal{G}_p$, if (13) is satisfied for
$(\ell_1, r_1)$ and $(\ell_2, r_2)$ then

$$\ell_1 r_2 \equiv \ell_2 r_1 \pmod{p^2},$$

which in turn implies the equation

$$\ell_1 r_2 = \ell_2 r_1 \tag{14}$$

(since $1 \le \ell_1, \ell_2, r_1, r_2 \le p-1$). Because $\ell_1, \ell_2, r_1, r_2$ are primes, we see
from (14) that either $(\ell_1, \ell_2) = (r_1, r_2)$, which is impossible for $w \not\equiv 1 \pmod{p^2}$,
or $(\ell_1, r_1) = (\ell_2, r_2)$, which means that when $w \in \mathcal{G}_p \setminus \{1\}$ is fixed,
then (13) is satisfied for at most one pair of primes $(\ell, r)$. Therefore

$$\#\mathcal{R}(p) \le N(p) \le \pi(p-1) + \#\mathcal{G}_p - 1 = p + O(p/\log p). \tag{15}$$

Now, since by the Cauchy inequality we have

$$\left( \sum_{a=0}^{p-1} Q(p, a) \right)^2 \le M(p) \sum_{a=0}^{p-1} Q(p, a)^2,$$

recalling (10) and (11) and using (15), we obtain

$$M(p) \ge (1 + o(1)) \pi(p-1)^2 p^{-1},$$

which concludes the proof. □



Clearly the bound of Theorem 10 is not tight. The image size $M_m$ of a
random map on an $m$ element set is expected to be

$$M_m = \left(1 - \frac{1}{e}\right) m = 0.63212\ldots m,$$

see [14, Theorem 2], and thus it is reasonable to expect that $M(p)/p \approx 1 - 1/e$.

We now give the average value of $M(p)/p$ taken over primes $p$ in the
intervals

$$J_i = [50000i, 50000(i+1)], \qquad i = 1, 2, 3, \tag{16}$$

and the whole interval

$$J = [50000, 200000]. \tag{17}$$



Range        $J_1$     $J_2$     $J_3$     $J$
# of primes  4459      4256      4136      12851
$M(p)/p$     0.63212   0.63208   0.63212   0.63211

Statistics of image sizes
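For a single prime the image size is straightforward to compute directly. The following sketch (with the hypothetical helper name `image_size`) evaluates $M(p)$ by brute force and can be used to compare $M(p)/p$ with $1 - 1/e$; it makes no attempt at the efficiency needed for the ranges in the table.

```python
from math import exp


def fermat_quotient(u: int, p: int) -> int:
    if u % p == 0:
        return 0
    return (pow(u, p - 1, p * p) - 1) // p


def image_size(p: int) -> int:
    """M(p): number of distinct values q_p(u) for 0 <= u <= p - 1."""
    return len({fermat_quotient(u, p) for u in range(p)})


if __name__ == "__main__":
    p = 10007  # a prime chosen only for illustration
    print(image_size(p) / p, 1 - exp(-1))
```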



3.5 Distribution of orbit lengths 

For any map $f$ defined on an $m$ element set, and any initial value $u_0$ from
this set, we consider the iterations $u_i = f(u_{i-1})$, $i = 1, 2, \ldots$. Then for some
$\rho > \mu \ge 0$ we have $u_\rho = u_\mu$. The smallest value of $\rho$ is called the orbit length
and the corresponding (and thus uniquely defined) value of $\mu$ is called the
tail length.
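The orbit length $\rho$ and the tail length $\mu$ can be found by recording the first index at which each value appears. The sketch below does this with a dictionary; the function name `orbit_and_tail` is our own, and the memory use is proportional to $\rho$ (a cycle-detection method such as Floyd's would avoid that, at the cost of more evaluations of $f$).

```python
def fermat_quotient(u: int, p: int) -> int:
    if u % p == 0:
        return 0
    return (pow(u, p - 1, p * p) - 1) // p


def orbit_and_tail(f, u0):
    """Return (rho, mu): the smallest rho with u_rho = u_mu for some mu < rho."""
    first_index = {}
    u, i = u0, 0
    while u not in first_index:
        first_index[u] = i   # u_i is seen for the first time at index i
        u = f(u)
        i += 1
    return i, first_index[u]


if __name__ == "__main__":
    p = 7
    print(orbit_and_tail(lambda u: fermat_quotient(u, p), 3))
```

For $p = 7$ and $u_0 = 3$ the iterations are $3, 6, 1, 0, 0, \ldots$, so $\rho = 4$ and $\mu = 3$.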

By [14, Theorem 3] the expected values $\rho_m$ and $\mu_m$ of the orbit and tail
length, taken over all random maps and initial values $u_0$, satisfy

$$\frac{\rho_m}{\sqrt{m}} = \sqrt{\pi/2} + o(1) \quad \text{and} \quad \frac{\mu_m}{\sqrt{m}} = \sqrt{\pi/8} + o(1),$$

as $m \to \infty$.

Here we present the results of computation of the average values of the
orbit and the tail lengths, scaled by $\sqrt{p}$, for the sequence (2) taken over primes
$p$ in the intervals $J_1$, $J_2$, $J_3$ and $J$, given by (16) and (17), respectively, and
a randomly chosen initial value $u_0 \in [1, p-1]$.



Range            $J_1$     $J_2$     $J_3$     $J$
# of primes      4459      4256      4136      12851
$\rho/\sqrt{p}$  1.2423    1.2445    1.2444    1.2437
$\mu/\sqrt{p}$   0.62179   0.62200   0.61806   0.62066

Statistics of orbit and the tail lengths, random $u_0$



Since the values $q_p(2)$ are of special interest, we also present similar data
where the initial value is always chosen as $u_0 = 2$.

Range            $J_1$     $J_2$     $J_3$     $J$
# of primes      4459      4256      4136      12851
$\rho/\sqrt{p}$  1.2381    1.2507    1.2401    1.2429
$\mu/\sqrt{p}$   0.61778   0.63004   0.62060   0.62275

Statistics of orbit and the tail lengths, $u_0 = 2$



The results show quite satisfactory matching with the expected values of

$$\sqrt{\pi/2} = 1.2533\ldots \quad \text{and} \quad \sqrt{\pi/8} = 0.62665\ldots.$$

Furthermore, we also give similar average values for $C(p)/\sqrt{p}$, where $C(p)$
is the total number of cyclic points in all possible trajectories of the map
$u \mapsto q_p(u)$ on the set $\{0, \ldots, p-1\}$, taken over primes from the same intervals
$J_1$, $J_2$, $J_3$ and $J$.



Range            $J_1$     $J_2$     $J_3$     $J$
# of primes      4459      4256      4136      12851
$C(p)/\sqrt{p}$  1.2413    1.2527    1.23706   1.2437

Statistics of cyclic points



By [14, Theorem 2] the number $C_m$ of cyclic nodes of a random map on
an $m$ element set is expected to be

$$C_m = \sqrt{\pi m/2} = 1.2533\ldots\sqrt{m},$$

which again is very close to the observed average values.
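The number of cyclic points of a map on a finite set can be obtained by repeatedly deleting points that have no preimage among the remaining points; what survives is exactly the union of the cycles. The sketch below applies this to the table of values of $q_p(u)$. The helper names `cyclic_points` and `fermat_quotient_table` are hypothetical.

```python
def fermat_quotient_table(p: int) -> list:
    """Table [q_p(0), ..., q_p(p-1)] computed directly."""
    return [0 if u == 0 else (pow(u, p - 1, p * p) - 1) // p for u in range(p)]


def cyclic_points(table: list) -> int:
    """Number of points lying on a cycle of the map x -> table[x]."""
    m = len(table)
    indegree = [0] * m
    for y in table:
        indegree[y] += 1
    # points with no preimage cannot be cyclic; peel them off iteratively
    stack = [x for x in range(m) if indegree[x] == 0]
    removed = 0
    while stack:
        x = stack.pop()
        removed += 1
        y = table[x]
        indegree[y] -= 1
        if indegree[y] == 0:
            stack.append(y)
    return m - removed
```

For $p = 7$ the table is $[0, 0, 2, 6, 4, 6, 1]$ and the cyclic points are the three fixed points $0$, $2$ and $4$.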
4 Pseudorandomness



4.1 Joint distribution 

For integers $M$, $N \ge 1$, $s \ge 1$ and an integer vector $\mathbf{a} = (a_0, \ldots, a_{s-1})$ we
consider the exponential sums

$$S_{s,p}(M, N; \mathbf{a}) = \sum_{u=M+1}^{M+N} e_p\left( \sum_{j=0}^{s-1} a_j q_p(u+j) \right).$$

Thus the above sums are generalisations of those of Lemma 2, which
correspond to the case $s = 1$. However the method of Heath-Brown [20] does
not seem to apply to the sums $S_{s,p}(M, N; \mathbf{a})$ as it requires good estimates
of multiplicative character sums with polynomials, which are not currently
known (see however [6] for some potential approaches in the case $s = 2$).

We are now ready to prove an estimate on $S_{s,p}(M, N; \mathbf{a})$ which together
with Lemma 1 implies an upper bound on the discrepancy of points (3).
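Theorem 11. For any integer $s \ge 1$, we have

$$\max_{\gcd(a_0, \ldots, a_{s-1}, p) = 1} |S_{s,p}(M, N; \mathbf{a})| \ll sp\log p$$

uniformly over $M$ and $p^2 \ge N \ge 1$.

Proof. Select any $\mathbf{a} = (a_0, \ldots, a_{s-1}) \in \mathbb{Z}^s$ with $\gcd(a_0, \ldots, a_{s-1}, p) = 1$ and
take $K = \lfloor N/p \rfloor$. We get

$$\begin{aligned}
S_{s,p}(M, N; \mathbf{a}) &= \sum_{u=M+1}^{M+Kp} e_p\left( \sum_{j=0}^{s-1} a_j q_p(u+j) \right) + O(p) \\
&= \sum_{u=1}^{Kp} e_p\left( \sum_{j=0}^{s-1} a_j q_p(u+M+j) \right) + O(p) \\
&= \sum_{v=1}^{p} \sum_{k=0}^{K-1} e_p\left( \sum_{j=0}^{s-1} a_j q_p(v+M+j+kp) \right) + O(p).
\end{aligned}$$

Let $\mathcal{V}$ be the set of $v = 1, \ldots, p$ with $v \not\equiv -M-j \pmod p$ for any
$j = 0, \ldots, s-1$. Therefore, using (7), we obtain:

$$S_{s,p}(M, N; \mathbf{a}) = W + O(p + sK), \tag{18}$$

where

$$\begin{aligned}
W &= \sum_{v \in \mathcal{V}} \sum_{k=0}^{K-1} e_p\left( \sum_{j=0}^{s-1} a_j \left( q_p(v+M+j) - k(v+M+j)^{-1} \right) \right) \\
&= \sum_{v \in \mathcal{V}} e_p\left( \sum_{j=0}^{s-1} a_j q_p(v+M+j) \right) \sum_{k=0}^{K-1} e_p\left( -k \sum_{j=0}^{s-1} a_j (v+M+j)^{-1} \right).
\end{aligned}$$

Taking now the absolute value, we obtain

$$|W| \le \sum_{v \in \mathcal{V}} \left| \sum_{k=0}^{K-1} e_p\left( -k \sum_{j=0}^{s-1} a_j (v+M+j)^{-1} \right) \right|.$$

Recalling Lemma 3 we deduce

$$|W| \ll \sum_{v \in \mathcal{V}} \min\left\{ K, \frac{p}{\|F_{\mathbf{a},s}(v)\|} \right\},$$

where

$$F_{\mathbf{a},s}(v) = \sum_{j=0}^{s-1} \frac{a_j}{v+M+j}.$$

Examining the poles of $F_{\mathbf{a},s}(v)$, we see that if $\gcd(a_0, \ldots, a_{s-1}, p) = 1$ then
it is a nonconstant rational function of degree $O(s)$ modulo $p$. Thus every
residue modulo $p$ occurs $O(s)$ times among the values $F_{\mathbf{a},s}(v)$, $v \in \mathcal{V}$. Hence

$$|W| \ll s \sum_{r=0}^{p-1} \min\left\{ K, \frac{p}{\|r\|} \right\} \ll sp\log p,$$

which concludes the proof. □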
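The sums $S_{s,p}(M, N; \mathbf{a})$ are easy to evaluate numerically for small $p$, which gives a feel for the cancellation behind Theorem 11. The sketch below is a brute-force evaluation with our own function name `exp_sum`; it is not an efficient method. One case can be checked by hand: for $s = 1$, $\gcd(a, p) = 1$ and a full period $N = p^2$, relation (7) makes the terms with $\gcd(u, p) = 1$ cancel exactly, while the $p$ multiples of $p$ each contribute 1, so the sum equals $p$.

```python
import cmath


def fermat_quotient(u: int, p: int) -> int:
    if u % p == 0:
        return 0
    return (pow(u, p - 1, p * p) - 1) // p


def exp_sum(p: int, M: int, N: int, a: list) -> complex:
    """Brute-force S_{s,p}(M, N; a) with s = len(a)."""
    total = 0j
    for u in range(M + 1, M + N + 1):
        phase = sum(aj * fermat_quotient(u + j, p) for j, aj in enumerate(a)) % p
        total += cmath.exp(2j * cmath.pi * phase / p)  # e_p(phase)
    return total


if __name__ == "__main__":
    p = 101
    print(abs(exp_sum(p, 0, p * p, [1, 1])), 2 * p * cmath.log(p).real)
```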

Using Lemma 1, we immediately obtain:

Corollary 12. For any fixed $s$, the discrepancy $\Delta_{p,s}(M, N)$ of points (3)
satisfies

$$\Delta_{p,s}(M, N) \ll N^{-1} p (\log p)^{s+1},$$

uniformly over $M$ and $p^2 \ge N \ge 1$.

4.2 Linear complexity

Here we estimate the linear complexity for a sufficiently long sequence of
consecutive values of $q_p(u)$.

Theorem 13. For $p^2 \ge N \ge 1$ the linear complexity $L_p(N)$ of the sequence
$q_p(u)$, $u = 0, \ldots, N-1$, satisfies

$$L_p(N) \ge \frac{1}{2} \min\{p - 1, N - p - 1\}.$$

Proof. Assume that

$$\sum_{j=0}^{L} c_j q_p(u+j) \equiv 0 \pmod p, \qquad 0 \le u \le N-L-1, \tag{19}$$

for some integers $c_0, \ldots, c_{L-1}$ and $c_L = -1$. Let $R = \min\{p - L, N - L - p\}$.
Then we see from (19) that for $1 \le u \le R-1$ we have

$$\sum_{j=0}^{L} c_j q_p(u+p+j) \equiv 0 \pmod p. \tag{20}$$

Recalling (7) and using (19) again, we now see that

$$\sum_{j=0}^{L} c_j q_p(u+p+j) \equiv \sum_{j=0}^{L} c_j \left( q_p(u+j) - (u+j)^{-1} \right) \equiv -\sum_{j=0}^{L} c_j (u+j)^{-1} \pmod p. \tag{21}$$

Comparing (20) and (21) we see that

$$\sum_{j=0}^{L} c_j (u+j)^{-1} \equiv 0 \pmod p, \qquad 1 \le u \le R-1.$$

We can assume that $L < p$ since otherwise there is nothing to prove. Clearing
the denominators, we obtain a nontrivial polynomial congruence

$$\sum_{j=0}^{L} c_j \prod_{\substack{h=0 \\ h \ne j}}^{L} (u+h) \equiv 0 \pmod p,$$

of degree $L$, which has $R-1$ solutions (to see that it is nontrivial it is enough
to substitute $u = 0$ in the polynomial on the left hand side). Therefore
$L \ge R - 1$ and the result follows. □
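The linear complexity of a concrete segment of the sequence can be computed with the Berlekamp-Massey algorithm over the field of $p$ elements, which allows the lower bound of Theorem 13 to be compared with actual values for small $p$. The sketch below is a standard textbook version of that algorithm, not something taken from the original text; the function name `linear_complexity` is our own.

```python
def fermat_quotient(u: int, p: int) -> int:
    if u % p == 0:
        return 0
    return (pow(u, p - 1, p * p) - 1) // p


def linear_complexity(seq: list, p: int) -> int:
    """Berlekamp-Massey over F_p: length of the shortest linear recurrence."""
    n_terms = len(seq)
    c = [1] + [0] * n_terms      # current connection polynomial C(x)
    b = [1] + [0] * n_terms      # C(x) as it was before the last length change
    L, m, d_prev = 0, 1, 1
    for n in range(n_terms):
        # discrepancy between seq[n] and the prediction of the current recurrence
        d = sum(c[i] * seq[n - i] for i in range(L + 1)) % p
        if d != 0:
            saved = c[:]
            coef = d * pow(d_prev, -1, p) % p
            for i in range(n_terms + 1 - m):     # C(x) -= coef * x^m * B(x)
                c[i + m] = (c[i + m] - coef * b[i]) % p
            if 2 * L <= n:                       # the recurrence must get longer
                L, b, d_prev, m = n + 1 - L, saved, d, 0
        m += 1
    return L


if __name__ == "__main__":
    p = 11
    seq = [fermat_quotient(u, p) for u in range(p * p)]
    print(linear_complexity(seq, p), min(p - 1, p * p - p - 1) / 2)
```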

The argument used in the proof of Theorem 13 can also be used to
estimate the linear complexity of arbitrary segments of the sequence $q_p(u)$,
although the resulting bound is slightly weaker.

Theorem 14. For $M$ and $p^2 \ge N \ge 1$ the linear complexity $L_p(M; N)$ of
the sequence $q_p(u)$, $u = M+1, \ldots, M+N$, satisfies

$$L_p(M; N) \ge \min\left\{ \frac{p-1}{2}, \frac{N-p-1}{3} \right\}.$$

Proof. Assume that

$$\sum_{j=0}^{L} c_j q_p(u+M+j) \equiv 0 \pmod p, \qquad 1 \le u \le N-L, \tag{22}$$

for some integers $c_0, \ldots, c_{L-1}$ and $c_L = -1$. Let $R = \min\{p, N - L - p\}$.
Then we see from (22) that for $1 \le u \le R$ we have

$$\sum_{j=0}^{L} c_j q_p(u+M+p+j) \equiv 0 \pmod p. \tag{23}$$

Recalling (7) and using (22) again, we now see that for any integer $u$ with
$u \not\equiv -M-j \pmod p$, $j = 0, \ldots, L$, we have

$$\sum_{j=0}^{L} c_j q_p(u+M+p+j) \equiv \sum_{j=0}^{L} c_j \left( q_p(u+M+j) - (u+M+j)^{-1} \right) \equiv -\sum_{j=0}^{L} c_j (u+M+j)^{-1} \pmod p. \tag{24}$$

Comparing (23) and (24) we see that

$$\sum_{j=0}^{L} c_j (u+M+j)^{-1} \equiv 0 \pmod p$$

for at least $R - L - 1$ values of $u$ with

$$1 \le u \le R \quad \text{and} \quad u \not\equiv -M-j \pmod p, \quad j = 0, \ldots, L.$$

As before we can assume that $L < p$ since otherwise there is nothing
to prove. Clearing the denominators, we obtain a nontrivial polynomial
congruence

$$\sum_{j=0}^{L} c_j \prod_{\substack{h=0 \\ h \ne j}}^{L} (u+M+h) \equiv 0 \pmod p$$

of degree $L$, which has at least $R - L - 1$ solutions (to see that it is nontrivial
it is enough to substitute $u = -M$ in the polynomial on the left hand side).
Therefore $L \ge R - L - 1$ and the result follows. □



5 Hash Functions from Fermat Quotients 
5.1 General Construction 

In this section we propose a new construction of hash functions based on 
iterations of Fermat quotients. A similar idea, however based on a very 
different family of functions, has been previously introduced by D. X. Charles, 
E. Z. Goren and K. E. Lauter [7]. 

Let $n$ and $r$ be two positive integers. Choose $2^r$ random $(n+1)$-bit primes
$p_0, \ldots, p_{2^r-1}$. We also consider a random initial $n$-bit integer $u_0$.

The hash function is built from a sequence of iterations of Fermat quotients
modulo $p_0, \ldots, p_{2^r-1}$. As in [7], the input of the hash function is used to
decide modulo what prime the next Fermat quotient is computed. More
precisely, given an input bit string $\Sigma$, we perform the following steps:

• Pad $\Sigma$ with at most $r - 1$ zeros on the left to make sure that its length
$L$ is a multiple of $r$.

• Split $\Sigma$ into blocks $\sigma_j$, $j = 1, \ldots, J$, where $J = L/r$, of length $r$ and
interpret each block as an integer $\ell \in [0, 2^r - 1]$.

• Starting at the point $u_0$, apply the Fermat quotient maps $q_{p_\ell}$ iteratively
by using the $n$ least significant bits of $u_{j-1}$ to form an $n$-bit integer $w_{j-1}$
and then computing

$$u_j = q_{p_\ell}(w_{j-1}).$$

• Output the last element in the above sequence, that is, $u_J = q_{p_\ell}(w_{J-1})$,
by outputting its $n$ least significant bits as the value of the hash
function.
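The construction can be prototyped in a few lines. The sketch below is a toy illustration only and makes several assumptions that are not in the text: the names `make_hash` and `fq_hash`, the use of Python's `random.Random` with a seed to pick the primes and $u_0$ (not a cryptographically secure source), naive trial division for primality (so only small $n$ is practical), and the representation of the input as a string of the characters `0` and `1`.

```python
import random


def fermat_quotient(u: int, p: int) -> int:
    if u % p == 0:
        return 0
    return (pow(u, p - 1, p * p) - 1) // p


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def random_prime(bits: int, rng: random.Random) -> int:
    """Random prime with exactly `bits` bits (toy method, small sizes only)."""
    while True:
        cand = rng.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_prime(cand):
            return cand


def make_hash(n: int, r: int, seed: int = 0):
    rng = random.Random(seed)
    primes = [random_prime(n + 1, rng) for _ in range(2 ** r)]  # p_0, ..., p_{2^r - 1}
    u0 = rng.getrandbits(n)                                     # initial n-bit value
    mask = (1 << n) - 1

    def fq_hash(bits: str) -> int:
        padded = "0" * (-len(bits) % r) + bits       # left-pad to a multiple of r
        u = u0
        for j in range(0, len(padded), r):
            ell = int(padded[j:j + r], 2)            # block read as an integer
            w = u & mask                             # n least significant bits
            u = fermat_quotient(w, primes[ell])      # u_j = q_{p_ell}(w_{j-1})
        return u & mask

    return fq_hash
```

With `h = make_hash(16, 2, seed=1)`, a call such as `h("110101")` returns a 16-bit integer. Because of the left padding, the inputs `"1"` and `"01"` are processed identically when $r = 2$.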

5.2 Collision Resistance 

We remark that the initial element $u_0$ is fixed and, in particular, does not
depend on the input of the hash function. Furthermore, the collision resistance
is based on the difficulty of deciding which Fermat quotient to
apply at each step when one attempts to back trace from a given output to
the initial element $u_0$ and thus produce two distinct strings $\Sigma_1$ and $\Sigma_2$ of the
same length $L$ with the same output.

Note that for strings of different lengths, say of $L$ and $L+1$, a collision can
easily be created. It is enough to take $\Sigma_2 = (0, \Sigma_1)$ (that is, $\Sigma_2$ is obtained
from $\Sigma_1$ by prepending 0). If $L \not\equiv 0 \pmod r$ then, after padding, they lead to the same
output. Certainly any practical implementation has to take care of things
like this.

We also note that the results of Section 3 suggest that the above hash
functions exhibit rather chaotic behaviour, which is close to the behaviour of
a random function. It is probably too early to make any suggestions about
the applicability of Fermat quotients for hashing but this direction definitely
deserves further study, both experimental and theoretical.

6 Comments 

Unfortunately we are not able to give any estimates on the discrepancy or
linear complexity of the orbits (2), which is a very interesting but possibly
hard question.

Obtaining analogues of Theorems 11, 13 and 14 which are nontrivial for
$N < p$ is another interesting question.

The method of proof of Theorems 13 and 14 does not apply to the
nonlinear complexity. We recall that the nonlinear complexity of degree $d$ of an
$N$-element sequence $s_0, \ldots, s_{N-1}$ of elements in a ring $\mathcal{R}$ is the smallest $L$
such that

$$s_{u+L} = \psi(s_{u+L-1}, \ldots, s_u), \qquad 0 \le u \le N-L-1,$$

where $\psi \in \mathcal{R}[Y_1, \ldots, Y_L]$ is a polynomial of total degree at most $d$. Estimating
the nonlinear complexity of Fermat quotients is of ultimate interest.
Finally, we remark that one can also study the sums

$$T_p(M, N; \chi) = \sum_{u=M+1}^{M+N} \chi(q_p(u))$$

with a nonprincipal multiplicative character $\chi$ modulo $p$. Arguing as in the
proof of Theorem 11 we get

$$|T_p(M, N; \chi)| \le \sum_{v=M+1}^{M+p-1} \left| \sum_{k=0}^{K-1} \chi\left( q_p(v+M) - k(v+M)^{-1} \right) \right| + p,$$

where $K = \lfloor N/p \rfloor$. One can now apply the Burgess bound, see [22,
Theorem 12.6], and get a nontrivial estimate on $T_p(M, N; \chi)$, starting with
$N \ge p^{5/4+\varepsilon}$ for any fixed $\varepsilon > 0$, see [28]. However it is natural to expect
that one can take advantage of additional averaging over $v$ and get a
nontrivial bound for smaller values of $N$. Furthermore, using (6) it is possible
to estimate bilinear character sums

$$W_p(\mathcal{A}, \mathcal{B}, U, V; \chi) = \sum_{0 < u \le U} \sum_{0 < v \le V} \alpha_u \beta_v \chi(q_p(uv))$$

with arbitrary complex weights $\mathcal{A} = (\alpha_u)$ and $\mathcal{B} = (\beta_v)$, and then using the
Vaughan identity, see [22, Section 13.4], estimate the character sums with
Fermat quotients at prime arguments, see [28] for details.

Furthermore, we remark that studying the map $x \mapsto (x^{p-1} - 1)/p$ in
the field of $p$-adic numbers is also of great interest, see [33] where a similar
question is considered for the maps given by (4). The other way around, it
is also quite natural to study the map modulo $p$.

Finally, analogues of Fermat quotients modulo a composite number are
certainly an exciting object of study with their own twists, see [H [21 IH ED].